Xu Cao

Toward Cognitive Supersensing in Multimodal Large Language Model

Feb 02, 2026

KaoLRM: Repurposing Pre-trained Large Reconstruction Models for Parametric 3D Face Reconstruction

Jan 19, 2026

PALM: Progress-Aware Policy Learning via Affordance Reasoning for Long-Horizon Robotic Manipulation

Jan 11, 2026

HOLO: Homography-Guided Pose Estimator Network for Fine-Grained Visual Localization on SD Maps

Jan 07, 2026

EAROL: Environmental Augmented Perception-Aware Planning and Robust Odometry via Downward-Mounted Tilted LiDAR

Aug 20, 2025

Incorporating Flexible Image Conditioning into Text-to-Video Diffusion Models without Training

May 27, 2025

Sage Deer: A Super-Aligned Driving Generalist Is Your Copilot

May 15, 2025

PMNI: Pose-free Multi-view Normal Integration for Reflective and Textureless Surface Reconstruction

Apr 14, 2025

SocialGesture: Delving into Multi-person Gesture Understanding

Apr 03, 2025

STAMICS: Splat, Track And Map with Integrated Consistency and Semantics for Dense RGB-D SLAM

Mar 27, 2025